Optimum Instruction-level Parallelism (ILP) for Superscalar and VLIW Processors

Authors

Patrick Hung

Michael J. Flynn

Abstract

Modern superscalar and VLIW processors fetch, decode, issue, execute, and retire multiple instructions per cycle. By taking advantage of instruction-level parallelism (ILP), processor performance can be improved substantially. However, increasing the level of ILP may eventually result in diminishing and negative returns due to control and data dependencies among subsequent instructions as well as resource conflicts within a processor. Moreover, the additional ILP complexity can have significant overhead in cycle time and latency. This technical report uses a generic processor model to investigate the optimum level of ILP for superscalar and VLIW processors.

Similar resources

For Embedded Applications with Data-level Parallelism, a Vector Processor Offers High Performance at Low Power Consumption and Low Design Complexity. Unlike Superscalar and VLIW Designs, a Vector Processor Is Scalable and Can Optimally Match Specific

Designers of embedded processors have typically optimized for low power consumption and low design complexity to minimize cost. Performance was a secondary consideration. Nowadays, many embedded systems (set-top boxes, game consoles, personal digital assistants, and cell phones) commonly perform computation-intensive media tasks such as video processing, speech transcoding, graphics, and high-b...

A Brief Overview on Runtime-Aware Architectures

When uniprocessors were the norm, Instruction Level Parallelism (ILP) and Data Level Parallelism (DLP) were widely exploited to increase the number of instructions executed per cycle. The main hardware designs that were used to exploit ILP were superscalar and Very Long Instruction Word (VLIW) processors. The VLIW approach implies statically figuring out dependencies between instructions and sc...

Scalable Vector Processors for Embedded Systems

A Study of Out-of-Order Completion for the MIPS R10K Superscalar Processor

Instruction-level parallelism (ILP) improves performance for VLIW, EPIC, and superscalar processors. Out-of-order execution improves performance further. However, the advantage of out-of-order execution is not fully utilized because of in-order completion. In this report we study the performance loss due to in-order completion for the MIPS R10000 processor.

Software Pipelining and Superblock Scheduling: Compilation Techniques for VLIW Machines

© Copyright Hewlett-Packard Company 1992. Compilers for VLIW and superscalar processors have to expose instruction-level parallelism to effectively utilize the hardware. Software pipelining is a scheduling technique to overlap successive iterations of loops, while superblock scheduling extracts ILP from frequently executed traces. This paper describes an effort to employ both software pipelining...

Publication date: 1999
